Cost-sensitive feature acquisition and classification

Authors

  • Shihao Ji
  • Lawrence Carin
Abstract

There are many sensing challenges for which one must balance the effectiveness of a given measurement with the associated sensing cost. For example, when performing a diagnosis a doctor must balance the cost and benefit of a given test (measurement), and the decision to stop sensing (stop performing tests) must account for the risk to the patient and doctor (malpractice) for a given diagnosis based on observed data. This motivates a cost-sensitive classification problem in which the features (sensing results) are not given a priori; the algorithm determines which features to acquire next, as well as when to stop sensing and make a classification decision based on previous observations (accounting for the costs of various types of errors, as well as the rewards of being correct). We formally define the cost-sensitive classification problem and solve it via a partially observable Markov decision process (POMDP). While the POMDP constitutes an intuitively appealing formulation, the intrinsic properties of classification tasks resist its application to this problem. We circumvent the difficulties of the POMDP via a myopic approach, with an adaptive stopping criterion linked to the standard POMDP. The myopic algorithm is computationally feasible, easily handles continuous features, and seamlessly avoids repeated actions. Experiments with several benchmark datasets show that the proposed method yields state-of-the-art performance, and importantly our method uses only a small fraction of the features that are generally used in competitive approaches.
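
The acquire-or-stop loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm. It assumes discrete features, a naive Bayes model with known class-conditional likelihoods, and a simple one-step stopping rule (stop when no remaining feature has a positive expected net gain) in place of the paper's POMDP-linked adaptive criterion. All names (`posterior`, `expected_reward`, `myopic_classify`, `likelihoods`, `costs`, `reward`) are hypothetical.

```python
import numpy as np

def posterior(prior, likelihoods, observed):
    """Class posterior given the observed features (naive Bayes assumption).

    likelihoods[j][c, v] = P(feature j takes value v | class c).
    observed maps feature index -> observed value.
    """
    post = prior.copy()
    for j, v in observed.items():
        post = post * likelihoods[j][:, v]
    return post / post.sum()

def expected_reward(post, reward):
    """Expected reward of the best decision made now.

    reward[d, c] = payoff of deciding class d when the true class is c.
    """
    return float((reward @ post).max())

def myopic_classify(x, prior, likelihoods, costs, reward):
    """Greedy one-step-lookahead acquisition (illustrative stopping rule)."""
    observed = {}
    total_cost = 0.0
    while True:
        post = posterior(prior, likelihoods, observed)
        stop_value = expected_reward(post, reward)
        best_gain, best_j = 0.0, None
        for j in range(len(likelihoods)):
            if j in observed:  # never acquire the same feature twice
                continue
            # Predictive distribution of feature j under the current belief.
            pv = post @ likelihoods[j]
            value = 0.0
            for v, p in enumerate(pv):
                if p > 0:
                    new_post = posterior(prior, likelihoods, {**observed, j: v})
                    value += p * expected_reward(new_post, reward)
            # Net gain: expected reward after sensing, minus cost, minus stopping now.
            gain = value - costs[j] - stop_value
            if gain > best_gain:
                best_gain, best_j = gain, j
        if best_j is None:  # no feature is worth its cost: stop and classify
            decision = int((reward @ post).argmax())
            return decision, observed, total_cost
        observed[best_j] = int(x[best_j])
        total_cost += costs[best_j]
```

With one informative and one uninformative feature at equal cost, this sketch would be expected to acquire only the informative feature before classifying, which mirrors the abstract's point that only a fraction of the features need be used.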

Similar resources

Ensemble Classification and Extended Feature Selection for Credit Card Fraud Detection

Due to the rise of technology, the possibility of fraud in different areas such as banking has increased. Credit card fraud is a crucial problem in banking, and its danger is ever increasing. This paper proposes an advanced data mining method that considers both feature selection and decision cost to improve the accuracy of credit card fraud detection. After selecting the best and most effec...

Coarse-to-Fine, Cost-Sensitive Classification of E-Mail

In many real-world scenarios, it is necessary to make judgments at differing levels of granularity due to computational constraints. Particularly when there are a large number of classifications that must be done in a real-time streaming setting and there is a significant difference in the time required to acquire different subsets of features, it is important to have an intelligent strategy fo...

Proposing a Novel Cost Sensitive Imbalanced Classification Method based on Hybrid of New Fuzzy Cost Assigning Approaches, Fuzzy Clustering and Evolutionary Algorithms

In this paper, a new hybrid methodology is introduced to design a cost-sensitive fuzzy rule-based classification system. A novel cost metric is proposed based on the combination of three different concepts: Entropy, Gini index and DKM criterion. In order to calculate the effective cost of patterns, a hybrid of fuzzy c-means clustering and particle swarm optimization algorithm is utilized. This ...

A New Formulation for Cost-Sensitive Two Group Support Vector Machine with Multiple Error Rate

Support vector machine (SVM) is a popular classification technique that classifies data using a max-margin separating hyperplane. The normal vector and bias of this hyperplane are determined by solving a quadratic model, which means that SVM training amounts to an optimization problem. Among the extensions of SVM, the cost-sensitive scheme refers to a model with multiple costs which conside...

Feature Extraction of Visual Evoked Potentials Using Wavelet Transform and Singular Value Decomposition

Introduction: Brain visual evoked potential (VEP) signals are commonly known to be accompanied by high levels of background noise typically from the spontaneous background brain activity of electroencephalography (EEG) signals. Material and Methods: A model based on dyadic filter bank, discrete wavelet transform (DWT), and singular value decomposition (SVD) was developed to analyze the raw data...

ACE-Cost: Acquisition Cost Efficient Classifier by Hybrid Decision Tree with Local SVM Leaves

The standard prediction process of SVM requires acquisition of all the feature values for every instance. In practice, however, a cost is associated with the mere act of acquisition of a feature, e.g. CPU time needed to compute the feature out of raw data, the dollar amount spent for gleaning more information, or the patient wellness sacrificed by an invasive medical test, etc. In such applicat...

Journal:
  • Pattern Recognition

Volume 40

Publication date: 2007